Those stimuli which increase the strength of the responses they follow
are ordinarily designated reinforcers. Traditionally, two types of
reinforcers have been distinguished. Primary reinforcers (S^Rs) are stimuli
which have reinforcing properties at the beginning of an experiment and can
be expected to hold these properties throughout the experiment. Conditioned
or secondary reinforcers (S^rs) are stimuli which do not have reinforcing
properties at the beginning of an experiment, but acquire these properties
during it.

A  satisfactory  description  of  conditions  by  which  stimuli  acquire  and 
maintain  conditioned  reinforcing  properties  is  necessary  for  an  adequate 
theory  of  behavior  and  behavior  change.     That  these  conditions  have  not 
yet been adequately specified is evidenced by the lack of apparent
agreement among the results of various experiments. There seems to be general
agreement, however, that stimuli become conditioned reinforcers by
preceding other reinforcers. The present paper will describe three procedures
which  have  been  used  to  study  the  variables  affecting  the  conditioned 
reinforcing  value  of  stimuli.     The  problems  with  the  extinction  test,  and 
with  chaining  procedures  will  be  described  briefly.    Then  problems  with 
the  new  response  procedure,  which  is  used  here,  will  be  specified.     Only  a 
few  experiments  representative  of  each  procedure  will  be  covered. 

Note that we are not concerned here with Pavlovian higher order
conditioning, in which response-independent procedures are employed throughout.
The present concern is with those procedures which measure the effect of
antecedent training on the tendency for S^r to increase or maintain the
strength of the response it follows. That is, the concern is with a
reinforcer in the sense of a reward and not in the sense of an unconditioned
stimulus.
With the extinction test for conditioned reinforcement, response
strength above baseline is first produced by repeatedly following
responses with primary reward. A neutral stimulus acquires conditioned
reinforcing properties when it is presented between the response and S^R
on acquisition trials, or in separate pairings with S^R. The primary reward
is then omitted, and the rate of response is measured when responses are
followed only by S^r. Greater resistance to extinction when S^r is contingent
on a response than for a control without S^r is usually attributed to the
conditioned reinforcing effect of S^r.

One  of  the  earlier  experiments  with  this  technique  was  Bugelski's 
(1938).    He  trained  hungry  rats  to  press  a  bar  for  food.     Each  food 
delivery  was  preceded  by  an  audible  click.     The  rats  were  then  divided 
into  two  groups  for  extinction.     Bar  presses  of  the  control  group  had  no 
effect,  while  bar  presses  of  the  experimental  group  produced  the  click 
but  no  food.     The  experimental  group  made  significantly  more  responses  in 
extinction  than  the  control  group.     Bugelski  inferred  that  the  click  served 
as  a  sub-goal  or  conditioned  reinforcer  in  extinction. 

Later  experimenters  noted  that  Bugelski's  results  were  open  to  other 
interpretations.     Wyckoff,  Sidowski,  and  Chambliss  (1958)  suggested  that 
the  click  might  serve  as  a  positive  discriminative  stimulus,  rather  than 
a conditioned reinforcer. Experiment II of Wyckoff et al. was a test
of  this  hypothesis.     Here  rats  were  trained  and  divided  into  two  groups 
for  an  extinction  test.     These  two  groups  differed  only  in  the  temporal 
relationship  between  a  barpress  and  a  buzz  which  had  preceded  water 
presentations.     In  the  experimental  group,  a  rat  received  the  buzz  following 
each  barpress.     Each  rat  in  the  control  group  was  yoked  to  a  rat  in  the 
experimental group. Each time the experimental rat pressed the bar, its
control counterpart received the buzz, provided that the latter had not
pressed  the  bar  in  the  preceding  ten  sec.    The  two  groups  did  not  differ 
significantly  in  mean  rate  of  responding  during  extinction.    Thus,  no 
secondary  reinforcing  effect  of  the  buzzer  was  demonstrated. 

There are two major problems with the extinction test of conditioned
reinforcement. First, during the course of the extinction test the
conditioned reinforcing value of S^r may extinguish quite rapidly. Thus the
extinction test may not be sensitive to the conditioned reinforcing value
of the S^r established during training. Second, the test is sensitive
to variables which affect the similarity of testing and training trials.
Variables like partial reinforcement (of both the response and S^r) can
exert a great deal of influence on the test.

Chaining  procedures  are  another  way  to  study  conditioned  reinforcement. 
In  chaining  procedures,  one  or  more  stimuli  precede  the  primary  reward. 
Transition  to  successive  stimuli  and  access  to  the  primary  reward  are 
each  contingent  on  a  response.    Kelleher  and  Gollub  (1962)  lauded  the 
chaining  procedure  as  a  means  of  testing  the  conditioned  reinforcing 
effects of S^r without removing it from the training sequence.

Following  the  final  link  in  the  chain  with  primary  reward,  however, 
introduces  additional  problems.     First,  it  may  be  difficult  to  determine 
whether  a  reinforcing  effect  is  attributable  to  the  conditioned  reinforcing 
stimuli  or  a  direct  incremental  effect  of  the  primary  reward  which  follows. 
Second, when the increasing proximity to a reward influences the operant
level of responses used in the chain, increases in rate due to the strength
of S^r are confounded with increases in operant level associated with proximity
to primary reward, i.e., to stimulus-reinforcer effects. Staddon and
Simmelhag (1971) demonstrated that the peck, commonly studied in chaining,
is particularly sensitive to changes in food proximity and probability.
In  their  experiment,  an  observer  recorded  the  behaviors  of  food  deprived 
pigeons  under  three  schedules  of  access  to  food.     Of  particular  interest 
here  is  the  behavior  of  birds  under  the  Fixed  Time  (FT)  12  sec.  schedule. 
(Staddon  and  Simmelhag  called  this  Fixed  Interval  12  sec.  although  the  more 
common designation of the response-independent schedule is FT.) Here, the
probability of a peck (to the food hopper wall) increased as the time of food
delivery approached, independent of any differential reinforcement for
pecking. Thus there may be strong response-independent effects during the
chaining procedures which would introduce a confound.

In  addition,  generalization  of  response  tendencies  complicates  the 
interpretation  of  the  results  when  the  response  preceding  primary  reward 
is the same as the response preceding the earlier links. To the extent
that environmental conditions preceding the primary reward and the
conditions preceding earlier links are similar, response tendencies will
generalize from the former to the latter.

Finally, there is the acquisition of a new response measure of
conditioned reinforcement. This procedure involves separate training and
testing trials. On the training trials, S^r is repeatedly followed by
primary reward. On test trials, S^r is made contingent on a response, and
the measure of conditioned reinforcement is the increase in the strength
of the response. Because S^R never follows S^r on test trials, it is possible
to rule out direct reinforcing effects of S^R.
The results of many experiments using the new response technique
do not appear to lead to clear-cut conclusions. The results of Wyckoff,
Sidowski,  and  Chambliss's  Experiment  II,  for  example,  indicate  no  conditioned 
reinforcing  effect  of  a  stimulus  previously  paired  with  primary  reward.  In 
this  experiment,  thirsty  rats  were  first  presented  several  times  with  a 
buzzer  which  was  immediately  followed  by  water.     The  rats  were  then 
divided  into  two  groups.     Rats  in  the  experimental  group  received  the 
buzzer  for  each  barpress.     Rats  in  the  control  group  received  the  buzzer 
on  an  FT  1  min.  schedule  provided  they  had  not  pressed  the  bar  in  the 
preceding  ten  sec.     Response  rates  in  the  experimental  group  were  no 
different  from  those  in  the  control  group.     However,  in  an  experiment  by 
Saltzman  (1949)  rats  acquired  a  choice  response  as  rapidly  when  it  was 
followed  by  a  conditioned  reinforcement  as  when  it  was  followed  by  a 
primary  reward. 

In  the  new  response  procedure  the  secondary  reinforcing  effect  must 
generalize  from  training  to  test  trials.     Although  the  environmental  conditions 
during S^r are identical on training and on test, the contexts of the
respective presentations may differ. As an example, consider training
trials and test trials presented in separate phases. Here, training stimuli
and test stimuli might be discriminable because they are temporally discrete,
and reinforcing properties gained in the former may not generalize to the
latter.

Consider a pair of experiments by Saltzman (1949). The experiments
included four groups which differed with respect to training; however, we
will be concerned here only with the two groups receiving continuous
reinforcement (CRF) training. Phase I training trials took place in a
runway. For Group 1 and Group C, each of 5 daily trials led to a goal
box consistently baited with food. The goal box was the same brightness
on all trials for a given subject (either black or white). Phase II test
trials took place in a single choice maze. For Group 1, one goal box
of the maze was black and one was white. Rats obtained no food on test
trials. For Group C, both goal boxes were the brightness not used in
training trials and one of the boxes was consistently baited with food.
Fifteen test trials were run. Group C selected the rewarded arm
significantly more than half the time. Group 1, however, did not select the
arm leading to the color previously rewarded significantly above chance.

In Saltzman's second experiment, rats in the experimental groups
received a food reinforced runway trial after each test trial. This
procedure might increase the choice of S^r in Group 1 in two ways. First,
including food trials in the test block increases its similarity to the
training block in which food was present after each of the trials.
Second, the interspersed food trials might recondition the conditioned
reinforcing effect of the goal box cues as they extinguish over test
trials. Indeed, in Experiment II, Group 1 chose S^r significantly more than
half of the time.

In Saltzman's experiments discussed here, the physical context of
presentation of S^r differed on training and on test trials also (i.e.,
training takes place in a runway, and testing in a single choice maze).
It is not unreasonable to assume that this facilitates a discrimination
between training and test trials and as a consequence reduces
generalization of effects from the former to the latter.

There is one way in which training trials and test trials necessarily
differ with the new response procedure. Test presentations of S^r are
always preceded by a response and usually by a stimulus controlling the
response. On the other hand, training trials are preceded by the
inter-trial interval (ITI) conditions. Here, at the time of termination
of S^r the two types of trials might be discriminated on the basis of the
short-term memories (STM) of different antecedent conditions. The
S^r-in-context can be expected to acquire differential reinforcing value or
attractiveness on the two types of trials to the extent that the memory
of the preceding conditions remains a salient cue until S^r termination.

A body of research has developed on the STM of pigeons. Several
experiments indicate that the memory for such events as brief colored key
stimuli and peck responses may not last longer than a few seconds
(Shimp, 1976; Roberts and Grant, 1974). If applied to the preceding
analysis of the discrimination of conditions on the two types of
trials, the STM results imply that for S^rs of longer duration, the
reinforcing value acquired would generalize between S^r-S^R trials and R-S^r
trials. Figure 1 describes the general nature of the relationship assumed
between duration of stimuli and discriminability of conditions just before
trial termination. If S^r is sufficiently long, conditions may not be
discriminably different on the R-S^r and S^r-S^R trials at the time of S^r
termination; therefore there should be a substantial amount of
generalization between S^r-S^R trials and R-S^r trials and the latter trials
should have a strong reinforcing effect. However, if S^r is very brief, the
conditions would be expected to be discriminable on the two types of trials.
To the extent that the two contexts are discriminated, any reinforcing
value which generalized from S^r-S^R trials should extinguish rapidly.
The present experiment is designed to determine whether the duration
of a stimulus used in the acquisition of a new response procedure has the
expected effect. Food deprived pigeons were assigned to four groups
which were treated identically except for the duration of the red and green
stimuli used as S^r. The stimuli were of 30, 10, 3, and 1 sec. duration
for Groups 30, 10, 3, and 1 respectively. All birds were exposed to a
series  of  response-independent  trials  in  which  the  two  stimuli  on  the  key 
differentially  predicted  food  (e.g.,  the  probability  of  food  following  red 
was  .9  and  the  probability  of  food  following  green  was  .1).     In  a  later 
phase,  there  were  also  choice  trials  in  which  two  white  keys  served  as  the 
choice  stimuli.     Red  followed  a  peck  to  one  key  and  green  followed  a  peck 
to  the  other  key.     One  might  expect,  if  there  is  100%  generalization  between 
stimuli  on  response-independent  and  choice  trials,  that  red  would  be  the 
stronger  conditioned  reinforcer  and  therefore  would  come  to  be  selected 
more  often  on  the  choice  trials.     If,  on  the  other  hand,  red  following 
choice  is  so  different  from  red  on  the  response-independent  trials  that 
there  is  no  generalization  between  them,  red  and  green  would  be  chosen 
equally often. The contexts of the stimuli on response-independent and
choice trials differ only in the STM of their antecedents and could be
discriminated only by the short-term memory cues. On the assumption that
short-term memories for the antecedent stimuli last between 3 and 30 sec.,
differential predictions can be made for the four groups. Groups 1 and 3
would be expected to discriminate between the choice red and the
response-independent red and choose red only half the time. Group 30 would be
expected not to discriminate between the two and therefore to choose red on
the choice trials more than half of the time. The results for Group 10 depend
on the duration of the STM. If after 10 sec. the STM for antecedent
conditions provides effective differential cues, the results will resemble
those for Group 3. If the STM for the antecedent dark key and pecks to
the white key does not persist for 10 sec., the results for Group 10 will
resemble those for Group 30.
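
The predictions for the four groups can be restated as a small sketch. It is
only an illustration of the reasoning above, not part of the experimental
procedure. The parameter stm_span_sec is a hypothetical quantity introduced
here: the time after which memory of the antecedent conditions no longer
provides a usable cue. The all-or-none cutoff is likewise an assumption made
for simplicity.

```python
def predicted_choice(stimulus_sec: float, stm_span_sec: float) -> str:
    """Predicted choice of the .9 stimulus for one stimulus duration.

    stm_span_sec is a hypothetical parameter (not a measured value): the
    time after which STM of the antecedents stops serving as a cue.
    """
    if stimulus_sec > stm_span_sec:
        # Antecedent cues have faded by stimulus offset, so reinforcing
        # value generalizes from response-independent to choice trials.
        return "above chance"
    # Antecedent cues persist, so the two kinds of trial are discriminated.
    return "chance"


def predictions(stm_span_sec: float) -> dict:
    """Predictions for Groups 1, 3, 10, and 30 under one assumed STM span."""
    return {d: predicted_choice(d, stm_span_sec) for d in (1, 3, 10, 30)}


if __name__ == "__main__":
    # Two illustrative STM spans inside the assumed 3 to 30 sec. range.
    for span in (5, 20):
        print(span, predictions(span))
```

Under any assumed span between 3 and 30 sec., Groups 1 and 3 come out at
chance and Group 30 above chance; only the prediction for Group 10 changes
with the assumed span.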
Method

Subjects: Data are presented for the sixteen pigeons completing the
experiment. All of these experimentally naive birds were obtained locally.
Additional birds began the experiment, but did not complete it due to
equipment problems. There was no indication that these latter birds'
performance differed in any way from that of the birds completing the
experiment.

Apparatus: All sessions for Groups 30, 10, and 3 were conducted in two
standard  Grason-Stadler  two  key  pigeon  chambers  located  in  sound  attenuating 
housings.     Group  1  received  keypeck  training,  however,  in  a  single  key  box. 
This  group  was  switched  to  the  two  key  boxes  after  keypeck  training. 
Inner  dimensions  of  the  single  key  box  were:  length  -  35.0  cm.,  width  - 
32.5  cm.,  and  height  -  30.5  cm.    The  keys  in  the  box  were  21.0  cm.  from 
the  floor  of  the  chamber.     The  3  watt  houselight  was  located  in  the  ceiling 
of the chamber, behind a piece of translucent plastic. In all other
important respects the single key box was like the two key boxes. The inner
dimensions  of  the  two  key  chambers  were:  length  -  32.6  cm.,  width  -  32.0  cm., 
and  height  -  30.7  cm.     The  keys,   slightly  recessed  in  a  panel  perpendicular 
to  the  door  of  the  chamber,  were  translucent  and  were  transilluminated  by 
IEE  stimulus  projectors.     The  keys  were  25.0  cm.  from  the  floor  of  the 
chamber and 5.8 cm. apart, center to center. When a key was lit, pressure on
it with a force of .06 N registered as a peck and produced a feedback click.
An opening providing access to the food hopper was located beneath and
between the two keys. The Grason-Stadler hopper was filled with a 50-50 mixture
of wheat and milo. When presented, the hopper was illuminated by a 1.1 watt
bulb. A 3 watt houselight was mounted on a panel opposite the keys at a
height of 19.2 cm. It was covered by an upside-down styrofoam cup. The
houselight was off when the hopper was accessible and at session termination
and was on at all other times. A ventilation fan and a speaker transmitting
white noise (each located behind the intelligence panel) served to mask
extraneous sounds. Events were programmed by electro-mechanical equipment
located in an adjoining room.

Procedure:     The  birds  were  maintained  at  75%  of  free  feeding  weight  and 
individually  housed  in  a  room  with  constant  illumination.     They  had  free 
access  to  water  in  their  home  cages.    During  preliminary  training,  each 
bird was trained to eat promptly from the magazine whenever it was
presented.

Groups  30,  10,  3,  and  1  received  the  colored  key  stimuli  for  30,  10, 
3,  and  1  sec.  respectively  throughout  all  phases  of  the  experiment.    Group  1 
was run after the other 3 groups, and differences in its treatment will be
noted where appropriate. Before Phase I training began, birds were
unsystematically assigned to Groups 30, 10, and 3. Birds for Group 1 were
selected  from  the  pigeon  colony  in  an  unsystematic  fashion  and  there  is 
no  reason  to  believe  that  they  differed  from  birds  in  the  other  groups  in 
any  relevant  way. 

On  the  day  following  completion  of  magazine  habituation,  Phase  I 
training  began.     Each  daily  session  of  Phase  I  consisted  of  40  single  key 
presentations  of  a  stimulus,  each  separated  by  a  dark  key  inter-trial 
interval  (ITI)  of  120  sec.     Red  and  green  stimuli  were  presented  equally 
often.     The  stimuli  were  presented  in  a  pseudo-random  order,  with  red 
and  green  appearing  equally  often  on  each  side.     Half  of  the  birds  in 
each  group  received  food  following  the  red  stimulus  nine  of  the  ten  times 
it appeared on each response key, and following the green stimulus only one
of the ten times it appeared on each key. These proportions were reversed
for  the  remaining  birds  in  each  group.     Phase  I  training  continued  for 
each  bird  in  Groups  30,  10,  and  3  until  it  met  an  overall  rate  criterion 
of  9.6  pecks  per  min. 
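
The composition of a Phase I session can be sketched as follows. This is an
illustration only: the original sessions were programmed with
electro-mechanical equipment, and the constraints on the pseudo-random order
are not stated, so a plain shuffle is assumed here. The function name and the
dictionary keys are hypothetical.

```python
import random


def phase1_session(rich_color="red", seed=None):
    """Build one 40-trial Phase I session as a list of trial records.

    Each color appears 10 times on each key. The rich (.9) color is
    followed by food on 9 of its 10 presentations per key, the lean (.1)
    color on 1 of 10. The ordering rule is an assumption (simple shuffle).
    """
    lean_color = "green" if rich_color == "red" else "red"
    trials = []
    for side in ("left", "right"):
        for color, n_food in ((rich_color, 9), (lean_color, 1)):
            for i in range(10):
                trials.append({"color": color, "side": side, "food": i < n_food})
    random.Random(seed).shuffle(trials)
    return trials


if __name__ == "__main__":
    session = phase1_session("red", seed=1)
    fed_red = sum(t["food"] for t in session if t["color"] == "red")
    fed_green = sum(t["food"] for t in session if t["color"] == "green")
    print(len(session), fed_red, fed_green)
```

Passing rich_color="green" gives the reversed proportions used for the
remaining birds in each group.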

Two  birds  in  Group  1  (#5  and  #6)  did  not  begin  pecking  when  exposed 
to  the  Phase  I  training  for  7  and  3  days  respectively.     For  this  reason, 
all  4  birds  in  Group  1  were  given  two  days  of  keypeck  training.  This 
training  took  place  in  the  single  key  chamber.     The  key  was  continuously 
illuminated and a pattern of horizontal lines on a white ground was
projected onto the key. Initial hand-shaping was followed by 15 min. of
continuous  reinforcement  of  keypecking.     Day  2  consisted  of  15  min.  of 
CRF.    All  birds  in  Group  1  were  then  run  on  the  Phase  I  procedure  for 
two  days. 

Phase  II  began  immediately  at  completion  of  Phase  I.     In  Phase  II 
sessions, there were 40 response-independent trials as in Phase I but
10  pairs  of  response-dependent  trials  were  randomly  interspersed.  The 
procedure  for  the  response-dependent  trials  is  represented  in  Fig.  2. 
Each  pair  consisted  of  a  choice  trial  followed  by  a  forced  trial  with  a 
dark  key  ITI  intervening.     Choice  trials  were  initiated  by  white  illumination 
of  both  keys.     When  a  key  was  pecked,  it  changed  to  the  appropriate  color 
(e.g.,  red  if  on  the  left  and  green  if  on  the  right)  for  the  appropriate 
duration  (1,  3,  10,  or  30  sec),  and  the  other  key  went  dark.     On  these 
trials,  food  followed  both  red  and  green  half  the  time.     A  forced  trial 
followed  each  choice  trial  by  120  sec.     Each  forced  trial  was  initiated 
by white illumination of the key which had not been pecked on the preceding
choice. The first forced trial peck changed the key to the color appropriate
for that side. Food was presented on a forced trial only if food had been
presented on the preceding choice trial. The key color followed most
often by food and the color which appeared on the right were fully
counterbalanced across subjects within groups.
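
The rule linking each choice trial to its forced trial can be sketched as
below. The sketch is hypothetical in its details: the text states only that
food followed both colors half the time on choice trials, so an independent
probability of .5 is assumed, and the function and key names are introduced
for illustration.

```python
import random


def run_pair(chosen_side, color_on_side, rng):
    """Return (choice_trial, forced_trial) for one response-dependent pair.

    color_on_side maps each key to the color it produces in the current
    phase, e.g. {"left": "red", "right": "green"}.
    """
    # Assumption: food on a choice trial is decided with probability .5,
    # regardless of which color the peck produced.
    food = rng.random() < 0.5
    forced_side = "right" if chosen_side == "left" else "left"
    choice = {"type": "choice", "side": chosen_side,
              "color": color_on_side[chosen_side], "food": food}
    # The forced trial uses the key not pecked on the choice trial and is
    # followed by food only if the choice trial was.
    forced = {"type": "forced", "side": forced_side,
              "color": color_on_side[forced_side], "food": food}
    return choice, forced


if __name__ == "__main__":
    rng = random.Random(0)
    print(run_pair("left", {"left": "red", "right": "green"}, rng))
```

Because each pair presents both colors with the same food outcome, the two
colors are followed by food equally often on response-dependent trials, so
a preference on choice trials cannot be attributed to differential primary
reward on those trials.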

Phase  II  training  continued  until  the  bird  met  the  criterion  of 
choice  of  the  .9  stimulus  on  19  of  the  20  choice  trials  on  two  successive 
days,  or  until  15  days  of  Phase  II  training  had  been  completed. 
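
The termination rule for a phase can be written as a short check. The
reading adopted here is an assumption: with 10 choice trials per session,
"19 of the 20 choice trials on two successive days" is taken to mean at
least 19 choices of the .9 stimulus summed over two consecutive sessions.
The function name is hypothetical.

```python
def phase_outcome(daily_counts, needed=19, max_days=15):
    """Decide whether a phase has ended.

    daily_counts holds the number of choices of the .9 stimulus (0 to 10)
    in each session so far. Returns (final_day, met_criterion), or None if
    the phase should continue.
    """
    for day in range(2, len(daily_counts) + 1):
        if day > max_days:
            break
        # Sum over the two most recent sessions up to this day.
        if daily_counts[day - 2] + daily_counts[day - 1] >= needed:
            return day, True
    if len(daily_counts) >= max_days:
        # The day limit was reached without meeting the choice criterion.
        return max_days, False
    return None


if __name__ == "__main__":
    print(phase_outcome([5, 6, 9, 10]))   # criterion met on day 4
    print(phase_outcome([5] * 15))        # stopped at the 15 day limit
```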

Phase  III  training  began  on  the  session  following  completion  of 
Phase  II.     In  Phase  III  each  bird  received  a  color  side  pairing  on  choice 
trials  opposite  to  the  one  in  Phase  II,  i.e.  if  red  followed  a  peck  to  the 
right  in  Phase  II,  green  followed  a  peck  to  the  right  in  Phase  III.  The 
determinants  for  termination  of  Phase  III  were  the  same  as  those  for  Phase  II. 
Results

The  birds  in  Groups  30,  10,  and  3  required  from  one  to  five  days  to 
reach  the  criterion  of  9.6  pecks  per  min.  in  Phase  I.    The  median  number 
of  days  required  was  two.     No  systematic  differences  were  observed  among 
the  three  groups.    Phase  I  training  for  Group  1  has  been  described  in 
the  Method  section  of  this  paper. 

Percent choice of .9 will henceforth designate the percent of the
choice  trials  on  which  the  stimulus  followed  by  food  on  90%  of  the 
response-independent  trials  was  selected.     Fig.  3  shows  the  mean  daily 
percent  choice  of  .9  for  each  group  in  Phases  II  and  III.     Each  bird 
which  was  reversed  before  the  15  day  limit  was  assigned  100%  choice  of 
.9  for  all  subsequent  sessions.     This  was  not  expected  to  differentially 
influence the results for the groups unless they completed training at
different  rates,  as  hypothesized.     Note  that  the  curves  for  Groups  10  and 
30  rose  to  between  80  and  100%  in  both  phases,  while  the  curves  for  Groups  1 
and  3  never  stayed  above  70%  for  two  consecutive  sessions.     Fig.  4  shows 
the  percent  choice  of  .9  as  a  function  of  days  for  individual  pigeons. 
In  Groups  10  and  30  combined,  5  of  8  birds  met  the  choice  criterion  in  both 
phases  within  15  days.     The  remaining  3  birds  met  the  criterion  in  only 
one  of  the  phases.     In  Groups  1  and  3  combined,  no  birds  met  the  criterion 
in both phases, and 3 did not meet the criterion in either phase. Table 1
shows the number of birds in the two duration categories (Groups 1 and 3
vs. Groups 10 and 30) which met the criterion in both phases or in one or
fewer of the phases. The durations were compared on the number in each
category meeting both criteria vs. the number meeting no more than one
of the criteria. Fisher's exact test indicated an effect of duration
significant at p<.025. Thus Groups 10 and 30 met criterion in both
phases significantly more often than Groups 1 and 3.
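
The arithmetic behind the exact test can be checked with a short sketch. The
cell counts are taken from the text (5 of 8 birds in Groups 10 and 30 met
both criteria; none in Groups 1 and 3 did). That Groups 1 and 3 together
contained 8 birds is inferred from the total of sixteen, and the use of a
one-tailed test is an assumption.

```python
from math import comb


def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher exact probability for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins whose upper-left cell is a or larger.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    p = 0.0
    for x in range(a, min(row1, col1) + 1):
        p += comb(row1, x) * comb(row2, col1 - x) / total
    return p


if __name__ == "__main__":
    # Rows: Groups 10 and 30, Groups 1 and 3.
    # Columns: met both criteria, met one or none.
    p = fisher_one_tailed(5, 3, 0, 8)
    print(round(p, 4))  # 0.0128
```

Under these assumptions the probability is 56/4368, or about .013, which is
consistent with the reported p<.025.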

Since  Group  1  received  pretraining  which  differed  from  that  given 
the  other  3  groups,  the  following  analyses  were  performed  first  using 
only  the  3  longer  durations  (Groups  3,  10,  and  30),  which  were  treated 
identically  except  for  the  value  of  the  independent  variable.     For  the 
first  analysis  Phases  II  and  III  were  combined,  to  minimize  the  effects 
of  side  preference  on  the  test  measures.    All  of  the  significance  tests 
performed  on  the  percent  choice  of  .9  data  use  only  data  from  the  last 
10  days  of  Phases  II  and  III.     The  10  day  figure  was  chosen  arbitrarily 
to  minimize  the  confounding  effects  of  relearning  the  color-side  pairing 
after  reversal.    To  produce  the  combined  score  the  mean  percent  choice  of 
.9  (over  the  last  10  days  of  the  phase)  was  averaged  over  Phases  II  and 
III  for  each  bird.     When  Phases  were  combined,  the  Kruskal-Wallis  one-way 
variance  analysis  indicated  a  significant  main  effect  (p<.02).     This  effect 
was  also  apparent  when  Phase  II  was  analysed  separately  (p<.05),  but  not 
for Phase III alone (.05<p<.10). An analysis including all 4 groups
produced a similar pattern of results. There was a significant effect for
Phases II and III when combined (p<.02); however, this effect was significant
only in Phase II (p<.05) when the phases were analysed separately. (For
Phase III, .10<p<.20.)
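
The combined score entered into these analyses can be sketched as follows.
The daily values in the usage example are invented for illustration and are
not data from the experiment. The sketch assumes each bird's list of daily
percentages has already been extended to the full phase length, with 100%
assigned to sessions after an early reversal as described above.

```python
from statistics import mean


def combined_score(phase2_daily, phase3_daily, last_n=10):
    """Average of the two phase means of percent choice of .9.

    Each phase mean is taken over the last last_n days of that phase.
    """
    phase2_mean = mean(phase2_daily[-last_n:])
    phase3_mean = mean(phase3_daily[-last_n:])
    return mean([phase2_mean, phase3_mean])


if __name__ == "__main__":
    # Hypothetical bird: 15 days in Phase II, 10 days in Phase III.
    print(combined_score([50] * 5 + [80] * 10, [60] * 10))
```

Averaging the two phases gives each color-side pairing equal weight, which
is how combining the phases limits the influence of a side preference on the
score.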

Fig. 5 shows the mean peck rate during .9 and .1 over days beginning
with  the  first  Phase  II  session  for  Groups  30,  10,  and  3.     (The  rates  in 
Group  1  were  so  low  as  to  be  negligible.)     Note  that,  since  the  birds  in  a 
group  completed  the  phases  at  different  rates,  the  points  in  Fig.  5  are 
not  all  based  on  data  for  4  birds. 
Two major trends are apparent in the rate data of Fig. 5. First,
the mean rates in Groups 10 and 3 are higher than the overall rates
in Group 30. This is consistent with data on rates of autopecking
collected by Perkins, Beavers, Hancock, Hemmendinger, Hemmendinger, and
Ricci (1975), and Terrace, Gibbon, Farrell, and Baldock (1975). The
former experimenters found that rates of autopecking were lower to long
duration stimuli than they were to short duration stimuli. Terrace et
al. found that groups of birds in which the ITI was long acquired an
autopeck response faster and attained a higher asymptotic rate than groups
of birds which had a short ITI. In the present experiment, in Group 30
the mean rate to the .9 stimulus increased initially but dropped off as
training progressed. Appendix A contains plots of the mean pecks per sec.
for individual birds in the four groups. Notice that the decrease in mean
rate to .9 over sessions in Group 30 is largely due to the influence of two
birds (#J21 and #Q8). Notice also that birds in Group 30 approached
asymptotic rate to the .1 stimulus somewhat more slowly than they approached
asymptotic rate to the .9 stimulus.

The  late  peak  of  the  peck  rate  to  the  .1  stimulus  in  Group  30  might  be 
interpreted  as  indicating  that  the  .1  stimulus  had  not  attained  its  maximal 
secondary  reinforcing  value  by  the  beginning  of  the  choice  trials. 
This  fact  might  contribute  to  the  increased  choice  of  .9  in  Group  30.  This 
cannot  account  entirely  for  the  present  results,  however,  since  each  bird 
in  Group  30  reached  asymptotic  rate  to  the  .1  stimulus  before  Phase  III  began. 
Discussion

The  present  results  indicate  that  long  duration  stimuli  (10  and  30 
sec.)  demonstrate  greater  conditioned  reinforcing  effects  than  short 
duration stimuli (1 and 3 sec.). This result was predicted on the basis of
assumptions about processes which might affect the generalization of
attractiveness  from  training  to  testing  trials.     That  is,  with  the 
shorter  duration  stimuli,  the  different  antecedents  of  training  and  test 
trials could, through 'retention in memory', provide differential cues
throughout  the  stimulus.     With  the  longer  duration  stimuli,  however, 
there  is  less  probability  that  the  differential  cues  provided  by  STM  of 
antecedents  will  persist  until  stimulus  offset  and  trial  consequences 
take  effect. 

Several  experiments  indicate  that  the  probability  that  antecedent 
conditions  will  control  responding  is  a  decreasing  function  of  time  since 
exposure to those conditions (e.g., Roberts, 1972; Shimp, 1976). The
parameters  of  this  function  are  influenced  by  the  procedures  used  and  the 
nature  of  the  antecedent  conditions.     Experiments  using  the  delayed 
matching  to  sample  (DMTS)  procedure  typically  report  performance  at 
chance levels at delays longer than 5 sec. (e.g., Roberts, 1972; Shimp,
1976). However, using the advance procedure, Honig (1974) found performance
nearly  undiminished  at  retention  intervals  up  to  20  sec.     In  addition, 
experiments  by  Roberts  and  Grant  (1974)  and  Shimp  (1976)  indicate  that  the 
accuracy  of  delayed  matching  to  sample  is  a  function  of  presentation  duration 
of  the  sample.     That  is,  as  the  exposure  time  of  the  antecedent  stimulus 
becomes longer, the probability that it will be correctly matched increases.

Although there are experiments in which antecedent stimuli have
provided the basis for differential responding at delays up to 20 sec.,
there  are  reasons  to  believe  that  the  conditions  of  the  present  experiment 
would  make  the  differential  cues  short  lived.     The  two  white  keys  which 
served  as  choice  stimuli  were  generally  of  short  duration.    Most  subjects 
faced  the  key  panel  at  all  times,  and  pecked  one  of  the  white  keys 
immediately  at  onset,  introducing  the  appropriate  colored  stimulus  on  that 
key.     In  addition,  the  dark  keys,  which  preceded  the  response-independent 
trials,  were  of  low  salience.    Therefore,  neither  the  dark  nor  the  white 
keys  are  likely  to  have  provided  differential  cues  for  as  much  as  30  sec. 

The choice response, a peck to the left or the right key, also
preceded the S^r on choice trials. The data from Shimp's 1976 experiment
indicated that the memory for stimuli and responses was relatively brief,
less than 6 sec. In addition, pecking to the S^r itself might interfere
with the memory for the antecedent choice response and associated stimuli.

Finally,  there  was  little  or  no  incentive  for  discriminating  response- 
independent  from  choice  trials.     Choice  did  not  influence  the  probability 
that  food  followed  a  particular  trial.     If  incentive  is  a  factor  in 
'remembering',  then  the  memory  for  antecedent  conditions  in  the  present 
experiment  would  have  been  short-lived. 

In  summary,  results  of  memory  experiments  are  quite  consistent  with 
the  explanation  given  for  the  present  results. 

The  present  results  have  important  implications  for  research  on  con- 
ditioned reinforcement.     Earlier  reviewers  of  experiments  on  conditioned 
reinforcement (Meyers, 1958; Longstreth, 1971; Schuster, 1969) found the
phenomenon  to  be  inadequately  demonstrated.     These  reviewers  apparently 
assumed, however, that conditioned reinforcing effects would generalize
completely from training trials (where S^r is followed by S^R) to test
trials (where S^r is not followed by S^R). Their approach runs into
problems with this assumption. The fault is not with the concept of
conditioned reinforcement, but rather with the failure to consider
organismic processes in its acquisition and demonstration.

It is difficult to evaluate the effect that discrimination of training
from test trials may have had on results of past experiments. Many
published reports give only summary scores, such as means or medians over
the total R-S^r trials. An experiment by Armus and Garlich (1961) does,
however, report data over trials or days, allowing an estimation of the
effects of extinction. In the Armus and Garlich experiment two groups
were studied. In one group (Group CRF), a light-sound compound stimulus
(S^r) was always followed by food on training trials. In another group
(Group FR5), food followed S^r on an FR 5 schedule. The conditioned
reinforcing value of S^r was tested in a two-bar apparatus. Bars retracted
for 6 sec. upon each press. Presses to one of the bars produced S^r on a CRF
schedule. The percent of presses to the S^r-bar was used as a measure of
the conditioned reinforcing effect of S^r. Group CRF never chose the S^r-bar
significantly more than chance over the 15 blocks of 10 trials each. Group
FR5, however, showed a peak choice on Block 11 of over 70%, which dropped
off to 50% by Block 15. One might speculate that the conditioned reinforcing
value of the CRF stimulus extinguished so rapidly that its reinforcing
effect never showed up in the test. The conditioned reinforcing effect of
the FR 5 stimulus, however, is clearly demonstrated and drops off after
about 130 trials.

Any approach treating the S^r as though it were identical on training and on
test trials could only have predicted the present results by assuming that
long duration stimuli either acquire greater conditioned reinforcing
properties or acquire these properties faster than the shorter
duration stimuli. Results of experiments on delay of reward, however,
lead  toward  the  opposite  conclusion.     A  short  delay  preceding  reward  is 
generally  chosen  over  a  longer  delay  when  the  absolute  amount  of  reward 
is  held  constant  (e.g.,  Schneider,  1972).     If  this  preference  is  transmitted 
to  the  delay  stimulus,  a  stimulus  associated  with  a  short  delay  to  reward 
would  become  a  stronger  conditioned  reinforcer  than  a  stimulus  associated 
with  a  longer  delay.    A  factor  of  this  sort  may  have  operated  in  this 
experiment.     Recall  that  in  Phase  II,  Group  10  reached  asymptote  in  mean 
choice  of  .9  long  before  Group  30.     In  fact,  Group  10  reached  the  maximum 
mean choice of .9 within 4 days of the start of Phase II. In Phase III,
however, Groups 10 and 30 approached asymptote at more similar rates.
This  reduction  in  rate  of  acquisition  for  Group  10  might  reflect  the  cumulative 
effect  of  discrimination  of  training  from  test  trials. 

These results have implications which reach farther, even, than research
on conditioned reinforcement. This experiment indicates that those
theories or models which always treat the experimental situation as though
it consisted of a set of component stimuli have limited usefulness. The
present results could not have been predicted, nor can they
be accounted for, by a schema which deals only with simple component stimuli.
When specified in terms of environmental events, S^r is identical on training
and on test trials. Thus a strict component view would predict 100%
generalization between the two types of trials. One might account for the behavior
of birds in Groups 1 and 3, however, in terms of formation of a conditional
discrimination.    That  is,  when  the  colored  key  stimuli  are  differentially 
associated  with  food  they  are  preceded  by  the  ITI  conditions,  but  when 
each is followed by food equally often they are preceded by the white
choice  stimuli  and  a  peck.     The  birds  in  Groups  1  and  3  behave  as  though 
a  discrimination  of  this  sort  has  been  learned.    Any  theory  postulating 
simple  component  stimuli  with  no  interaction  between  the  components 
(e.g., Rescorla and Wagner, 1972) could not account for this result.

Figure 1 - The proposed relationship between the duration of a stimulus
and the discriminability of the STM cues for antecedent conditions.

Figure 2 - The sequence of events for the response-dependent trials of
Phases II and III. Half of the birds received the illustrated color-side
pairing on choice trials (i.e., red appeared on the left) in Phase II and
half of them received the reverse pairing (i.e., green appeared on the left).

Figure 3 - Mean percent choice of .9 over the 15 days of Phases II and III
for Groups 30, 10, 3, and 1.

Figure 4 - The percent choice of .9 in Phases II and III for individual
birds.

Figure 5 - The mean peck rate over days for Groups 30, 10, and 3 for
response-independent and response-dependent trials separately. Consecutive
days are plotted irrespective of phase. Points on curves to the right of
arrows are based on data for three birds.

Table 1 - The number of birds in Groups 1 and 3 and in Groups 10 and 30
meeting both or meeting one or fewer of the two phase criteria.

APPENDIX A:

Response rates for individual birds in Groups 3, 10, and
30 for response-independent and response-dependent trials
separately.

Experiments often fail to show that stimuli (S^r's) which have preceded
primary rewards strengthen new responses. If conditioned reinforcing
properties are acquired by the total set of prevailing cues rather than
component stimuli, the absence of secondary reinforcing effects may be
explained by a generalization decrement or discrimination between
conditions which include S^r on S^r-reward trials and those which include
S^r on response-S^r trials. Since the two types of trials have different
antecedents, this might result from a difference in short-term memory (STM) cues.
An experiment was designed to test the implication of this view that when
environmental conditions are identical on both kinds of trials, pigeons
will acquire new responses followed by S^r better as S^r duration is
increased from 1 to 30 sec.

Twelve naive pigeons served as subjects. The red and green stimuli
used as S^r were of 30, 10, 3, and 1 sec. duration for the four experimental
groups. The 1 sec. group was added after the other groups had completed
training. Its treatment necessarily differed somewhat from theirs. In
Phase  I,  birds  received  response-independent  trials  in  which  red  and  green 
keylights  were  differential  predictors  of  food  (e.g.,  food  followed  red 
with  a  probability  of  .9,  and  green  with  a  probability  of  .1).     In  Phase  II, 
choice-forced  trial  pairs  were  interspersed  among  the  response-independent 
trials.     Choice  trials  were  signalled  by  white  illumination  of  the  two 
keys.     A  peck  to  one  key  produced  the  red  stimulus,  and  a  peck  to  the  other 
produced  the  green  stimulus.     On  the  forced  trial,  only  the  key  not 
previously  chosen  was  available.     Food  followed  choice  and  forced  trials 
with  a  probability  of  .5.     In  Phase  III,  the  color-side  pairing  on  choice 
trials  was  reversed. 
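
The food contingencies just described can be sketched as a small simulation. This is only an illustrative sketch, not the apparatus program used in the experiment: the function and constant names are hypothetical, red is taken as the .9 color (as in the example above), and equal numbers of red and green response-independent trials are assumed, which the text does not state.

```python
import random

# Probabilities taken from the description above; names are hypothetical.
P_FOOD_RESPONSE_INDEPENDENT = {"red": 0.9, "green": 0.1}
P_FOOD_CHOICE_OR_FORCED = 0.5


def response_independent_trial(rng):
    """One response-independent trial: a keylight color, then food or no food.

    Assumption: red and green are equally likely on any trial.
    """
    color = rng.choice(["red", "green"])
    food = rng.random() < P_FOOD_RESPONSE_INDEPENDENT[color]
    return color, food


def choice_forced_pair(rng, chosen_side, color_on_side):
    """One choice trial followed by its forced trial.

    The peck on the choice trial produces the color assigned to the chosen
    side; on the forced trial only the other key is available. Food follows
    either trial with probability .5, whatever the color.
    """
    other_side = "right" if chosen_side == "left" else "left"
    pair = []
    for kind, side in (("choice", chosen_side), ("forced", other_side)):
        food = rng.random() < P_FOOD_CHOICE_OR_FORCED
        pair.append((kind, color_on_side[side], food))
    return pair


def food_rate(trials, color):
    """Proportion of (color, food) trials of the given color that ended in food."""
    outcomes = [food for c, food in trials if c == color]
    return sum(outcomes) / len(outcomes)
```

Reversing the color-side pairing for Phase III amounts to swapping the values of `color_on_side`. The point of the sketch is that the colors predict food differentially only on response-independent trials, so any preference on choice trials must come from that training.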
The stimulus which was the better predictor of food on response-independent
trials was chosen above chance by Groups 30 and 10, but not
by Groups 3 and 1. Kruskal-Wallis nonparametric analysis of variance
showed that Groups 30 and 10 chose the .9 stimulus significantly more often
than did Groups 3 and 1 (p < .05).

These  results  indicate  that  conditioned  reinforcing  effects  are 
acquired by a total set of cues (including STM's) present on S^r-reward
trials  and  not  by  component  stimuli  irrespective  of  context. 


